Getting Started

ANU BDSI workshop
Introduction to R programming

Emi Tanaka

Biological Data Science Institute

3rd April 2024

Welcome 👋

Teaching team

Dr. Emi Tanaka

Helper TBD
  • Who are you?
    • What statistical software have you used before?
    • Introduce yourself to people around you

Workshop materials

All materials will be hosted at
https://anu-bdsi.github.io/workshop-intro-R/

Learning objectives

The main aim is for you to get started with using R for basic computations.

  • Conduct elementary arithmetic operations using R
  • Grasp the concept of missing values within the R environment
  • Compute basic summary statistics including mean, median, quartiles, and standard deviation using R
  • Install external packages in R to extend functionality
  • Manipulate lists, matrices, and vectors in R
  • Navigate the RStudio integrated development environment (IDE)
  • Import and export data in R
  • Comprehend various object types in R
  • Create basic functions, employ conditional statements, and utilize for loops in R
  • Decipher error messages and do basic troubleshooting
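
To give a flavour of several of these objectives, here is a minimal sketch in R. The vector `x`, the function `describe_size()`, and the cut-off of 5 are made up for illustration and are not part of the workshop materials.

```r
# A numeric vector with one missing value (NA)
x <- c(2, 4, NA, 8, 10)

# Summaries return NA when a missing value is present...
mean(x)                              # NA

# ...unless missing values are removed first with na.rm = TRUE
avg   <- mean(x, na.rm = TRUE)       # 6
med   <- median(x, na.rm = TRUE)     # 6
quart <- quantile(x, na.rm = TRUE)   # 0%, 25%, 50%, 75% and 100% quantiles
sdev  <- sd(x, na.rm = TRUE)         # roughly 3.65

# A small function that uses conditional statements
# (hypothetical helper; the cut-off of 5 is arbitrary)
describe_size <- function(value) {
  if (is.na(value)) {
    "missing"
  } else if (value > 5) {
    "large"
  } else {
    "small"
  }
}

# A for loop that applies the function to each element of x
labels <- character(length(x))
for (i in seq_along(x)) {
  labels[i] <- describe_size(x[i])
}
labels
```

Each of these ideas is only touched on here; the point is simply to show what short pieces of R code can look like.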

What is R?

  • R is a programming language used predominantly for data analysis
  • RStudio Desktop is an integrated development environment (IDE) that helps you to use R

How to use R?

  • RStudio Desktop (or RStudio IDE) is the most common way to use R

  • You can type operations directly into the Console pane
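
As a rough illustration of what typing into the Console looks like, the lines below can be entered one at a time at the `>` prompt. The names `radius` and `area` are made up for this example.

```r
# Each line is evaluated when you press Enter
3 + 4        # addition
10 / 4       # division
sqrt(16)     # calling a built-in function

# Store a value under a name, then reuse it
radius <- 2
area <- pi * radius^2

# Typing a name on its own prints the stored value
area
```

Results are printed directly below each command in the Console pane.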

Live demo

Customise Global Options

  • Go to RStudio > Tools > Global Options…
  • Under the General tab, make sure “Restore .RData into workspace at startup” is unticked.
  • This stops old data from being loaded into your workspace unexpectedly. Otherwise your code may work only in your own workspace and not for others, which is poor practice for reproducibility.

R Packages

  • R packages are community-developed extensions to R (much like apps on your mobile)
  • The Comprehensive R Archive Network (CRAN) is a volunteer-maintained repository that hosts submitted R packages that have been approved (much like an app store)
    • There are close to 20,000 packages available on CRAN
    • The quality of R packages varies
  • There are other repositories that host R packages, e.g. Bioconductor for bioinformatics, R Universe, R-Forge, GitHub (we won’t cover these)
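
A minimal sketch of working with packages is shown below. The package `palmerpenguins` is only an example choice, and its install line is commented out because it needs an internet connection. The `stats` and `utils` packages ship with R, so the rest runs as is.

```r
# Install a package from CRAN (run once; requires an internet connection)
# install.packages("palmerpenguins")

# Check whether a package is installed, without loading it
has_stats <- requireNamespace("stats", quietly = TRUE)
has_stats

# Load an installed package so its functions can be used by name
library(stats)

# Alternatively, call a function with the package::function prefix
first_three <- utils::head(letters, 3)
first_three
```

Installing is done once per R installation, whereas `library()` is called in every new session where the package is needed.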


Why learn R?

  • R is one of the top programming languages for statistics or data science
    • Python is a good alternative language for data science
    • It is better to master at least one language than none
  • R was initially developed by statisticians for statisticians
    • State-of-the-art statistical methods are generally more readily available in R
  • R has an active and friendly community
  • R is free and open source software (FOSS)
    • free = money is not a barrier to use it
    • open source software = transparency

How to get better at R?

  • PRACTICE
  • Practice with a purpose (e.g. using R on your own data)
  • Try teaching others and helping them with their R problems
  • Have a willingness to continuously learn and adapt
    • R is an ever-evolving language (check the release news every so often)
    • New features and packages are added very frequently
    • whether you are a beginner or not, there are always things we do not know about R
  • Do you have any strategies or tips? Please share!